Laplace's Rule of Succession in Information Geometry

Author

  • Yann Ollivier
Abstract

When observing data x_1, ..., x_t modelled by a probability distribution p_θ(x), the maximum likelihood (ML) estimator θ_ML = argmax_θ Σ_{i=1}^{t} ln p_θ(x_i) cannot, in general, safely be used to predict x_{t+1}. For instance, for a Bernoulli process, if only “tails” have been observed so far, the probability of “heads” is estimated to be 0. Laplace’s famous “add-one” rule of succession (e.g., [Grü07]) regularizes θ by adding 1 to the count of “heads” and of “tails” in the observed sequence. Bayesian estimators suffer less from this problem, as every value of θ contributes, to some extent, to the Bayesian prediction of x_{t+1} given x_{1:t}. However, their use can be limited by the need to integrate over parameter space or to use Monte Carlo samples from the posterior distribution. For Bernoulli distributions, Laplace’s rule is equivalent to using a uniform prior on the Bernoulli parameter. The non-informative Jeffreys prior on the Bernoulli parameter corresponds to Krichevsky and Trofimov’s “add-one-half” rule [KT81]. Thus, in this case, some Bayesian predictors have a simple implementation. We claim (Theorem 1) that for exponential families, Bayesian predictors can be approximated by mixing the ML estimator with the sequential normalized maximum likelihood (SNML) estimator from universal coding theory [RSKM08, RR08], which is a fully canonical version of Laplace’s rule. The weights of this mixture depend on the density of the desired Bayesian prior with respect to the non-informative Jeffreys prior, and are equal to 1/2 for the Jeffreys prior, thus extending Krichevsky and Trofimov’s result. The resulting mixture also approximates the “flattened” ML estimator from [KGDR10]. Thus, it is possible to approximate Bayesian predictors without the cost of integrating over θ or sampling from the posterior. The statements below emphasize the special role of the Jeffreys prior and the Fisher information metric.
Moreover, the analysis reveals that the direction of the shift from the ML predictor to Bayesian predictors is systematic and given by an intrinsic, information-geometric vector field on statistical manifolds. This could contribute to regularization procedures in statistical learning.
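
The Bernoulli case described in the abstract can be checked numerically. The following is a minimal sketch, not the paper's own code: the function names are hypothetical, and it assumes that for a Bernoulli model the SNML predictor reduces to Laplace's add-one rule, as the phrase "a fully canonical version of Laplace's rule" suggests. Under that assumption, an equal-weight mixture of the ML and add-one predictors should stay close to the Krichevsky-Trofimov add-one-half predictor.

```python
def ml_predictor(heads, tails):
    """Maximum likelihood estimate of P(next = heads): the raw frequency."""
    return heads / (heads + tails)


def laplace_predictor(heads, tails):
    """Laplace's add-one rule (uniform prior on the Bernoulli parameter)."""
    return (heads + 1) / (heads + tails + 2)


def kt_predictor(heads, tails):
    """Krichevsky-Trofimov add-one-half rule (Jeffreys prior)."""
    return (heads + 0.5) / (heads + tails + 1)


def half_mixture(heads, tails):
    """Equal-weight mixture of ML and add-one predictors.

    Assumption: add-one stands in for SNML in the Bernoulli case.
    """
    return 0.5 * ml_predictor(heads, tails) + 0.5 * laplace_predictor(heads, tails)


if __name__ == "__main__":
    for heads, tails in [(0, 10), (30, 70), (300, 700)]:
        mix = half_mixture(heads, tails)
        kt = kt_predictor(heads, tails)
        print(heads, tails, ml_predictor(heads, tails), mix, kt, abs(mix - kt))
```

For these formulas the gap between the mixture and the add-one-half rule works out to (1/2 - heads/t) / ((t + 1)(t + 2)) with t = heads + tails, so it shrinks on the order of 1/t² as more data are observed.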


Similar resources

Identification and prioritizing Succession Management Barriers in Shiraz Petrochemical Company by mixed method

The purpose of this study was to identify and prioritize the barriers to succession management in Shiraz Petrochemical Company. A mixed method was used. In the qualitative stage, the obstacles to succession management were explored by means of thematic analysis and semi-structured interviews. The potential participants in this stage were 30 people including experts in the Shiraz Petrochemical ...


A Formal Theory of Inductive Inference. Part I

The following sections will apply the foregoing induction systems to three specific types of problems, and discuss the "reasonableness" of the results obtained. Section 4.1 deals with the Bernoulli sequence. The predictions obtained are identical to those given by "Laplace's Rule of Succession." A particularly important technique is used to code the original sequence into a set of integers whic...


A Multiresolution Mesh Generation Approach for Procedural Definition of Complex Geometry

As a general approach to procedural mesh definition we propose two mechanisms for mesh modification: generalized subdivision and rule based mesh growing. In standard subdivision, a specific subdivision rule is applied to a mesh to get a succession of meshes converging to a limit surface. A generalized approach allows different subdivision rules at each level of the subdivision process. By limit...


A Multiresolution Mesh Generation Approach for Procedural Definition of Complex Geometry (color plates 1, 2, 3, 4, 5, and 6)

As a general approach to procedural mesh definition we propose two mechanisms for mesh modification: generalized subdivision and rule based mesh growing. In standard subdivision, a specific subdivision rule is applied to a mesh to get a succession of meshes converging to a limit surface. A generalized approach allows different subdivision rules at each level of the subdivision process. By limit...


The Role of Geometry of Yard in the Formation of the Historical Houses of Kashan

Geometry is a basic tool for establishing unity in Iranian architecture, and architects always take it into account because of the discipline and order it gives to architecture. The architecture of the house, given its specific functional role, sought to adapt geometrical principles to the best possible shape and to achieve a proper understanding of the proportions and proportions o...



Publication date: 2015